Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add limit to total number of fields in mapping #17357

Closed
wants to merge 1 commit into from

Conversation

yanjunh
Copy link
Contributor

@yanjunh yanjunh commented Mar 27, 2016

This is to prevent mapping explosion when dynamic keys such as UUID are used as field names. index.mapping.total_fields.limit specifies the total number of fields an index can have. An exception will be thrown when the limit is reached. The default limit is 1000. Value 0 means no limit. This setting is runtime adjustable

This is to prevent mapping explosion when dynamic keys such as UUID are used as field names. index.mapping.total_fields.limit specifies the total number of fields an index can have. An exception will be thrown when the limit is reached. The default limit is 1000. Value 0 means no limit. This setting is runtime adjustable
@yanjunh
Copy link
Contributor Author

yanjunh commented Mar 27, 2016

related to #17203
#17357

@s1monw
Copy link
Contributor

s1monw commented Mar 29, 2016

I wonder if we really need to make 0 a special value, why can't folks just use Long.MAX_VALUE or any high number? I also wonder if this PR can reject documents that have been accepted previously on the primary and trigger a mapping update on the replica due to some cluster state delays. This might also reject mapping merges coming from the master due to concurrent indexing? @jpountz should know more here, in any case I guess we need to make this check higher up the stack I suspect.

@s1monw s1monw added >enhancement review :Search/Mapping Index mappings, including merging and defining field types v5.0.0-alpha1 labels Mar 29, 2016
@@ -403,6 +406,13 @@ private void checkNestedFieldsLimit(Map<String, ObjectMapper> fullPathObjectMapp
}
}

private void checkTotalFieldsLimit(long totalMappers) {
long allowedTotalFields = indexSettings.getValue(INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING);
if (allowedTotalFields > 0 && allowedTotalFields < totalMappers) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like Simon suggested, I would just do: if (allowedTotalFields < totalMappers) {. If someone needs to have many fields anyway and understands the issue, (s)he can still set index.mapping.total_fields.limit=1000000 for instance.

@jpountz
Copy link
Contributor

jpountz commented Mar 29, 2016

@simonw The code has the check under the following if statement: if (reason == MergeReason.MAPPING_UPDATE). This means that the mapping is being merged due to a call to the update mapping API. So only the master node performs these checks, and only when there is a call to the update mapping API. In all other cases, like on data nodes or on the master node when it restores mappings from disk, the MergeReason would be MergeReason.MAPPING_RECOVERY so the check will be skipped. So I think we are fine?

@s1monw
Copy link
Contributor

s1monw commented Mar 29, 2016

@jpountz good can we add a comment there?

@jpountz
Copy link
Contributor

jpountz commented Mar 29, 2016

I will do it in a separate commit.

@jpountz
Copy link
Contributor

jpountz commented Mar 29, 2016

@simonw Done in c7bdfb1.

@s1monw
Copy link
Contributor

s1monw commented Mar 29, 2016

thx @jpountz

jpountz pushed a commit that referenced this pull request Mar 29, 2016
This is to prevent mapping explosion when dynamic keys such as UUID are used as field names. index.mapping.total_fields.limit specifies the total number of fields an index can have. An exception will be thrown when the limit is reached. The default limit is 1000. Value 0 means no limit. This setting is runtime adjustable

Closes #11443
@jpountz
Copy link
Contributor

jpountz commented Mar 29, 2016

@yanjunh It was so close that I applied the changes I suggested and then merged. I hope you don't mind. 361adcf

@yanjunh
Copy link
Contributor Author

yanjunh commented Mar 29, 2016

@jpountz Not at all. Thanks for merging this feature.

imotov added a commit that referenced this pull request Mar 29, 2016
…onIT#testDynamicUpdates

With restriction for the total number of fields introduced in #17357 this test can fail if a large number of records is randomly selected for indexing.
@segalziv
Copy link

@jpountz is there a chance that this PR can get back ported to 2.3.x ? I've discussed this with @yanjunh who pointed me to a fork 2.3.1 with this fix here: https://github.com/yanjunh/elasticsearch/tree/v2.3.1-maplimit
Thanks!

@clintongormley
Copy link

@segalziv no, this is a breaking change so we'll keep it in 5.0 only

@segalziv
Copy link

segalziv commented Apr 19, 2016

ok @clintongormley , thanks anyway

@thundercrawl
Copy link

@yanjunh Is this limitation is a hard value for ES usage ? because the setting will let us think it a red line, and in current USE CASE ,we have documents exceed 3000 fields, so we have to divide the document to different index? is there any bad things to exceed 10000 fields?

@yanjunh
Copy link
Contributor Author

yanjunh commented Nov 26, 2018

@thundercrawl The default limit is rather conservative. Depending on the resource (especially memory) ES nodes have, you can config much large limit. In practice, I have seen people using 10000 without problem.

@thundercrawl
Copy link

@yanjunh Great thanks, that helpful for us to make a design

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking :Search/Mapping Index mappings, including merging and defining field types v5.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants